Enable checksum validation for presigned URL downloads #7035

Open
jencymaryjoseph wants to merge 2 commits into feature/master/pre-signed-url-getobject from jencyjos/pre-signed-url/checksum-fix

@jencymaryjoseph

Contributor

Motivation and Context

S3 returns checksum headers (x-amz-checksum-crc32, x-amz-checksum-crc64nvme, etc.) only when the request includes x-amz-checksum-mode: ENABLED as an HTTP header. For presigned URL downloads, this header was not being sent because:

  1. The checksum interceptors (HttpChecksumValidationInterceptor) check request instanceof GetObjectRequest, but presigned URL downloads use PresignedUrlDownloadRequestWrapper — a different type — so the interceptor short-circuits.
  2. The presigned URL path skips normal request marshalling — the URL IS the pre-marshalled request, and the SDK just sends a GET without adding SDK-level headers.
  3. The header is signed into the URL at presigning time (listed in X-Amz-SignedHeaders), but the download path neither detected this nor sent the required header.

As a result, S3 never returned checksums for presigned URL downloads, making data integrity validation impossible.

Modifications

This change adds checksum header auto-detection in the request marshaller and enables response checksum validation through the existing SDK interceptor pipeline.

  • Marshaller (PresignedUrlDownloadRequestMarshaller):
    • Parses X-Amz-SignedHeaders from the presigned URL. If it contains x-amz-checksum-mode, adds the x-amz-checksum-mode: ENABLED header to the outgoing request.
    • No user configuration required — the SDK detects and handles this transparently.
  • Execution attributes (DefaultAsyncPresignedUrlExtension):
    • Sets HTTP_CHECKSUM execution attribute with requestValidationMode("ENABLED") and the full response algorithms list (matching the codegen-produced GetObject operation).
    • Enables HttpChecksumValidationInterceptor to validate response checksums. On mismatch, throws SdkClientException.
  • Javadocs:
    • AsyncPresignedUrlExtension (interface-level): Documents checksum validation behavior and limitations.
    • S3Presigner.presignGetObject(): Documents that enabling checksum mode makes the URL non-browser-executable.
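
The detection step described for the marshaller can be sketched with the JDK alone. This is a minimal illustration, not the PR's actual code: the class name `SignedHeadersSketch`, the method `signsChecksumMode`, and the example URLs are hypothetical. It also splits the signed-header list on `;` and matches entries exactly, which is an assumption that differs from the substring check shown in the review excerpt further down.

```java
import java.net.URI;
import java.util.Arrays;

// Hypothetical sketch; names are illustrative and not taken from the SDK.
final class SignedHeadersSketch {
    private static final String SIGNED_HEADERS_PARAM = "X-Amz-SignedHeaders=";
    private static final String CHECKSUM_MODE_HEADER = "x-amz-checksum-mode";

    /** Returns true if the URL lists x-amz-checksum-mode among its signed headers. */
    static boolean signsChecksumMode(URI presignedUrl) {
        // getQuery() returns the decoded query, so "%3B" is already ";".
        String query = presignedUrl.getQuery();
        if (query == null) {
            return false;
        }
        for (String param : query.split("&")) {
            if (param.startsWith(SIGNED_HEADERS_PARAM)) {
                String value = param.substring(SIGNED_HEADERS_PARAM.length());
                // Compare whole entries instead of using a substring match.
                return Arrays.asList(value.split(";")).contains(CHECKSUM_MODE_HEADER);
            }
        }
        return false;
    }

    public static void main(String[] args) {
        URI with = URI.create("https://bucket.example.com/key"
            + "?X-Amz-SignedHeaders=host%3Bx-amz-checksum-mode&X-Amz-Signature=abc");
        URI without = URI.create("https://bucket.example.com/key"
            + "?X-Amz-SignedHeaders=host&X-Amz-Signature=abc");
        System.out.println(signsChecksumMode(with));    // true
        System.out.println(signsChecksumMode(without)); // false
    }
}
```

When the method returns true, the caller would add `x-amz-checksum-mode: ENABLED` to the outgoing GET; otherwise the request is sent unchanged.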

Testing

  • Unit tests (PresignedUrlDownloadRequestMarshallerTest): Verifies header is added when x-amz-checksum-mode is in SignedHeaders, and not added otherwise.
  • WireMock tests (PresignedUrlChecksumValidationWiremockTest): Verifies matching checksum succeeds, mismatching checksum throws SdkClientException, and missing checksum skips validation.
  • Mock tests (DefaultAsyncPresignedUrlExtensionTest): Verifies download completes successfully when checksum headers are present.
  • Integration tests (AsyncPresignedUrlExtensionTestSuite): Verifies against live S3 that checksums are returned when checksum mode is enabled and not returned otherwise.
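
The three outcomes the WireMock tests cover (match, mismatch, missing header) can be illustrated with a small JDK-only sketch. It assumes a CRC32 checksum delivered as the Base64 encoding of the 4-byte big-endian value in `x-amz-checksum-crc32`, uses a plain map with lowercase keys for the response headers, and throws `IllegalStateException` as a stand-in for `SdkClientException`. The class and method names are hypothetical, and the SDK interceptor itself works differently internally.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Collections;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical sketch of response checksum validation; not the SDK interceptor.
final class ChecksumValidationSketch {
    static final String CRC32_HEADER = "x-amz-checksum-crc32";

    /** Base64 of the 4-byte big-endian CRC32 of the body. */
    static String crc32Base64(byte[] body) {
        CRC32 crc = new CRC32();
        crc.update(body);
        // ByteBuffer is big-endian by default; the cast keeps the low 32 bits.
        byte[] bytes = ByteBuffer.allocate(4).putInt((int) crc.getValue()).array();
        return Base64.getEncoder().encodeToString(bytes);
    }

    /** Returns true if validated, false if skipped; throws on mismatch. */
    static boolean validate(Map<String, String> responseHeaders, byte[] body) {
        String expected = responseHeaders.get(CRC32_HEADER);
        if (expected == null) {
            return false; // no checksum returned, so validation is skipped
        }
        String actual = crc32Base64(body);
        if (!expected.equals(actual)) {
            // Stand-in for the SdkClientException the SDK throws.
            throw new IllegalStateException(
                "Checksum mismatch: expected " + expected + " but computed " + actual);
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] body = "hello world".getBytes(StandardCharsets.UTF_8);
        Map<String, String> matching = Map.of(CRC32_HEADER, crc32Base64(body));
        System.out.println(validate(matching, body));               // true
        System.out.println(validate(Collections.emptyMap(), body)); // false
    }
}
```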

Screenshots (if appropriate)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist

  • I have read the CONTRIBUTING document
  • Local run of mvn install succeeds
  • My code follows the code style of this project
  • My change requires a change to the Javadoc documentation
  • I have updated the Javadoc documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed
  • I have added a changelog entry. Adding a new entry must be accomplished by running the scripts/new-change script and following the instructions. Commit the new file created by the script in .changes/next-release with your changes.
  • My change is to implement 1.11 parity feature and I have updated LaunchChangelog

License

  • I confirm that this pull request can be released under the Apache 2 license

@jencymaryjoseph jencymaryjoseph requested a review from a team as a code owner June 13, 2026 00:00
@jencymaryjoseph jencymaryjoseph force-pushed the jencyjos/pre-signed-url/checksum-fix branch from 74d413d to f3cc9d7 Compare June 15, 2026 00:10
    }
    for (String param : query.split("&")) {
        if (param.startsWith("X-Amz-SignedHeaders=")) {
            return param.substring("X-Amz-SignedHeaders=".length()).contains("x-amz-checksum-mode");

Contributor

I can't remember exactly how all of the signing works, but are these guaranteed to be lower case?

Contributor Author

Yes, V4CanonicalRequest.getCanonicalHeaders() lowercases all header names per the SigV4 specification, so X-Amz-SignedHeaders always contains lowercase values.

    if (query == null) {
        return false;
    }
    for (String param : query.split("&")) {

Contributor

I think the query parameters here are URL encoded - I'm trying to think through if there are any cases where we would end up with something invalid because we're not actually decoding these first.

Contributor Author

We use uri.getQuery() which returns the decoded query string, so parsing operates on decoded values.
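
For reference, the difference between the decoded and raw query accessors on `java.net.URI` can be seen in a short JDK-only sketch. The class name and URL are illustrative only.

```java
import java.net.URI;

// Illustrative only: shows decoded vs. raw query strings from java.net.URI.
final class QueryDecodingSketch {
    static String decodedQuery(String url) {
        return URI.create(url).getQuery();    // percent-escapes are decoded
    }

    static String rawQuery(String url) {
        return URI.create(url).getRawQuery(); // percent-escapes are preserved
    }

    public static void main(String[] args) {
        String url = "https://bucket.example.com/key"
            + "?X-Amz-SignedHeaders=host%3Bx-amz-checksum-mode";
        System.out.println(decodedQuery(url)); // X-Amz-SignedHeaders=host;x-amz-checksum-mode
        System.out.println(rawQuery(url));     // X-Amz-SignedHeaders=host%3Bx-amz-checksum-mode
    }
}
```

One case that may be worth keeping in mind: because `getQuery()` decodes before the string is split, an encoded `%26` inside another parameter's value becomes a literal `&` and would be treated as a parameter separator.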

private static final HttpChecksum RESPONSE_CHECKSUM_CONFIG = HttpChecksum.builder()
    .requestValidationMode("ENABLED")
    .responseAlgorithmsV2(
        DefaultChecksumAlgorithm.XXHASH3,

Contributor

I'm a little worried about the hard coding of these algorithms here. It would be easy for us to forget to update this with new algorithms.... is there any way we could create it dynamically?

Contributor Author

We can add a new public method in DefaultChecksumAlgorithm to return all checksum algorithms.
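
One possible shape for such a method is a self-registering constant holder, sketched below. Everything here is hypothetical: the class `ChecksumAlgorithmRegistrySketch`, the `register` helper, and `allAlgorithms()` are not SDK APIs, and only three algorithm names mentioned in this PR are included for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; not the SDK's DefaultChecksumAlgorithm.
final class ChecksumAlgorithmRegistrySketch {
    // Declared before the constants so it is initialized first.
    private static final List<ChecksumAlgorithmRegistrySketch> ALL = new ArrayList<>();

    static final ChecksumAlgorithmRegistrySketch CRC32 = register("CRC32");
    static final ChecksumAlgorithmRegistrySketch CRC64NVME = register("CRC64NVME");
    static final ChecksumAlgorithmRegistrySketch XXHASH3 = register("XXHASH3");

    private final String algorithmId;

    private ChecksumAlgorithmRegistrySketch(String algorithmId) {
        this.algorithmId = algorithmId;
    }

    // Every constant is created through this helper, so it is always listed.
    private static ChecksumAlgorithmRegistrySketch register(String id) {
        ChecksumAlgorithmRegistrySketch algorithm = new ChecksumAlgorithmRegistrySketch(id);
        ALL.add(algorithm);
        return algorithm;
    }

    /** Read-only view of every registered algorithm, in declaration order. */
    static List<ChecksumAlgorithmRegistrySketch> allAlgorithms() {
        return Collections.unmodifiableList(ALL);
    }

    String algorithmId() {
        return algorithmId;
    }

    public static void main(String[] args) {
        for (ChecksumAlgorithmRegistrySketch algorithm : allAlgorithms()) {
            System.out.println(algorithm.algorithmId());
        }
    }
}
```

With this pattern a newly added constant appears in `allAlgorithms()` automatically, so a caller building the response algorithm list would not need a separate hard-coded update.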
